AITopics | lexical analysis

lexical analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, here is the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Lexical Analysis of online Reviews on Human-AI Interactions

Arbab, Parisa, Fang, Xiaowen

arXiv.org Artificial Intelligence, Nov-18-2025

This study focuses on understanding the complex dynamics between humans and AI systems by analyzing user reviews. While previous research has explored various aspects of human-AI interaction, such as user perceptions and ethical considerations, there remains a gap in understanding the specific concerns and challenges users face. By using a lexical approach to analyze 55,968 online reviews from G2.com, Producthunt.com, and Trustpilot.com, this preliminary research aims to analyze human-AI interaction. Initial results from factor analysis reveal key factors influencing these interactions. The study aims to provide deeper insights into these factors through content analysis, contributing to the development of more user-centric AI systems. The findings are expected to enhance our understanding of human-AI interaction and inform future AI technology and user experience improvements.

artificial intelligence, data mining, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.54941/ahfe1005622

2511.1348

Country: North America > United States (0.14)

Genre:

Research Report (0.82)
Overview (0.69)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

1b84c4cee2b8b3d823b30e2d604b1878-Supplemental.pdf

Neural Information Processing Systems, Oct-2-2025, 08:26:41 GMT

artificial intelligence, category, cell phone, (18 more...)

Neural Information Processing Systems

Industry: Law Enforcement & Public Safety (0.30)

Technology: Information Technology > Artificial Intelligence (0.31)

What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training

Kloots, Marianne de Heer, Mohebbi, Hosein, Pouw, Charlotte, Shen, Gaofei, Zuidema, Willem, Bentum, Martijn

arXiv.org Artificial Intelligence, Jul-11-2025

How language-specific are speech representations learned by self-supervised models? Existing work has shown that a range of linguistic features can be successfully decoded from end-to-end models trained only on speech recordings. However, it's less clear to what extent pre-training on specific languages improves language-specific linguistic information. Here we test the encoding of Dutch phonetic and lexical information in internal representations of self-supervised Wav2Vec2 models. Pre-training exclusively on Dutch improves the representation of Dutch linguistic features as compared to pre-training on similar amounts of English or larger amounts of multilingual data. This language-specific advantage is well-detected by trained clustering or classification probes, and partially observable using zero-shot metrics.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2025-1526

2506.00981

Country:

Europe (0.28)
North America > Mexico (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Polarization and Morality: Lexical Analysis of Abortion Discourse on Reddit

Stanier, Tessa, Shin, Hagyeong

arXiv.org Artificial Intelligence, Jun-29-2024

This study investigates whether division on political topics is mapped with distinctive patterns of language use. We collect a total of 145,832 Reddit comments on the abortion debate and explore the language of the subreddit communities r/prolife and r/prochoice. With consideration of the Moral Foundations Theory, we examine lexical patterns in three ways. First, we compute proportional frequencies of lexical items from the Moral Foundations Dictionary in order to make inferences about each group's moral considerations when forming arguments for and against abortion. We then create n-gram models to reveal frequent collocations from each stance group and better understand how commonly used words are patterned in their linguistic context and in relation to morality values. Finally, we use Latent Dirichlet Allocation to identify underlying topical structures in the corpus data. Results show that the use of morality words is mapped with the stances on abortion.
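
The first two steps described in this abstract (proportional frequencies of dictionary items, then n-gram counts) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: `TOY_LEXICON`, `tokenize`, `proportional_frequencies`, `top_bigrams`, and the sample comments are all hypothetical, and the toy lexicon is not the real Moral Foundations Dictionary.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration; NOT the real Moral Foundations Dictionary.
TOY_LEXICON = {
    "care": {"protect", "harm", "suffer"},
    "liberty": {"choice", "freedom", "rights"},
}


def tokenize(text):
    """Lowercase a comment and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def proportional_frequencies(comments, lexicon):
    """Return, per lexicon category, the share of all tokens that belong to it."""
    tokens = [tok for comment in comments for tok in tokenize(comment)]
    total = len(tokens)
    counts = Counter()
    for tok in tokens:
        for category, words in lexicon.items():
            if tok in words:
                counts[category] += 1
    return {cat: (counts[cat] / total if total else 0.0) for cat in lexicon}


def top_bigrams(comments, k=3):
    """Count adjacent word pairs within each comment; return the k most frequent."""
    bigrams = Counter()
    for comment in comments:
        toks = tokenize(comment)
        bigrams.update(zip(toks, toks[1:]))
    return bigrams.most_common(k)


# Made-up comments standing in for one stance group's corpus.
comments = [
    "Protect the right to choice and freedom",
    "We must protect those who suffer harm",
]
freqs = proportional_frequencies(comments, TOY_LEXICON)
print(freqs)
print(top_bigrams(comments))
```

Running the same functions separately on each community's comments would give per-group frequencies that can then be compared; the topic-modeling step is not covered by this sketch.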

abortion, foundation, prochoice, (16 more...)

arXiv.org Artificial Intelligence

2407.00455

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Towards Algorithmic Fidelity: Mental Health Representation across Demographics in Synthetic vs. Human-generated Data

Mori, Shinka, Ignat, Oana, Lee, Andrew, Mihalcea, Rada

arXiv.org Artificial Intelligence, Mar-25-2024

Synthetic data generation has the potential to impact applications and domains with scarce data. However, before such data is used for sensitive tasks such as mental health, we need an understanding of how different demographics are represented in it. In our paper, we analyze the potential of producing synthetic data using GPT-3 by exploring the various stressors it attributes to different race and gender combinations, to provide insight for future researchers looking into using LLMs for data generation. Using GPT-3, we develop HEADROOM, a synthetic dataset of 3,120 posts about depression-triggering stressors, by controlling for race, gender, and time frame (before and after COVID-19). Using this dataset, we conduct semantic and lexical analyses to (1) identify the predominant stressors for each demographic group; and (2) compare our synthetic data to a human-generated dataset. We present the procedures to generate queries to develop depression data using GPT-3, and conduct analyses to uncover the types of stressors it assigns to demographic groups, which could be used to test the limitations of LLMs for synthetic data generation for depression data. Our findings show that synthetic data mimics some of the human-generated data distribution for the predominant depression stressors across diverse demographics.

african american, dataset, stressor, (15 more...)

arXiv.org Artificial Intelligence

2403.16909

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Maryland (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

Towards Lexical Analysis of Dog Vocalizations via Online Videos

Wang, Yufei, Zhang, Chunhao, Huang, Jieyi, Wu, Mengyue, Zhu, Kenny

arXiv.org Artificial Intelligence, Sep-21-2023

Deciphering the semantics of animal language has been a grand challenge. This study presents a data-driven investigation into the semantics of dog vocalizations via correlating different sound types with consistent semantics. We first present a new dataset of Shiba Inu sounds, along with contextual information such as location and activity, collected from YouTube with a well-constructed pipeline. The framework is also applicable to other animal species. Based on the analysis of conditioned probability between dog vocalizations and corresponding location and activity, we discover supporting evidence for previous heuristic research on the semantic meaning of various dog sounds. For instance, growls can signify interactions. Furthermore, our study yields new insights that existing word types can be subdivided into finer-grained subtypes and that the minimal semantic unit for Shiba Inu is word-related. For example, whimper can be subdivided into two types, attention-seeking and discomfort.
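
The conditional-probability analysis mentioned in this abstract can be illustrated with a small relative-frequency estimate of P(context | sound type). This is a sketch under stated assumptions, not the paper's pipeline: the function name `conditional_probabilities`, the label names, and the observations are all made up for illustration.

```python
from collections import Counter, defaultdict


def conditional_probabilities(observations):
    """Estimate P(context | sound) from (sound, context) pairs by relative frequency."""
    pair_counts = Counter(observations)
    sound_counts = Counter(sound for sound, _ in observations)
    table = defaultdict(dict)
    for (sound, context), n in pair_counts.items():
        table[sound][context] = n / sound_counts[sound]
    return dict(table)


# Made-up (sound type, activity) observations; not data from the paper.
observations = [
    ("growl", "playing"),
    ("growl", "playing"),
    ("growl", "eating"),
    ("whimper", "alone"),
]
probs = conditional_probabilities(observations)
print(probs)
```

A sound type whose probability mass concentrates on one context is a candidate for having a consistent meaning; a sound type split between clearly different contexts may be a candidate for subdivision into subtypes.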

dog vocalization, lexical analysis, online video

arXiv.org Artificial Intelligence

2309.13086

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.40)

Getting Down to Basics

Communications of the ACM, May-25-2021, 15:40:56 GMT

Writing the code to make a computer perform a particular job could be a Herculean task back in the 1950s and 60s. "In the early 1950s, people did numerical computation by writing assembly language programs," says Alfred V. Aho, professor emeritus of computer science at Columbia University. "Assembly language is a language very close to the operations of a computer, and it's a deadly way to program." Of course, people can program at higher levels of abstraction, but that requires translating the higher-level language into a more basic set of instructions the machine can understand. Compilers that efficiently perform that translation exist nowadays in large part due to the work of Aho and Jeffrey D. Ullman, professor emeritus of computer science at Stanford University. Their contribution to both the theory and practice of computer languages has earned them the 2020 ACM A.M. Turing Award. "Compilers are responsible for generating the software that the world uses today, these trillion ...

computer, computer science, ullman, (15 more...)

Communications of the ACM

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Massachusetts > Middlesex County > Lowell (0.05)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence (1.00)

Annotating Protein Function through Lexical Analysis

Nair, Rajesh, Rost, Burkhard

AI Magazine, Mar-15-2004

The rate at which expert annotators add the experimental information into more or less controlled vocabularies of databases snails along at an even slower pace. Most methods that annotate protein function exploit sequence similarity by transferring experimental information for homologues. A crucial development aiding such transfer is large-scale, work- and management-intensive projects aimed at developing a comprehensive ontology for gene-protein function, such as the Gene Ontology project. Some of these tools target parsing controlled vocabulary from databases; others venture at mining free texts from MEDLINE abstracts or full scientific papers.

artificial intelligence, lexical analysis, text processing, (7 more...)

AI Magazine

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.62)

Comments on lexical analysis

Miller, G. A.

Classics, Feb-1-1975

artificial intelligence, lexical analysis, natural language, (1 more...)

Classics

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.40)
